A Low-Rank Approximation for MDPs via Moment Coupling
Authors: Zhang and Gurvich
Abstract
Markov Decision Process Tailoring for Approximation Design
Optimal control problems are difficult to solve on large state spaces, calling for the development of approximate solution methods. In "A Low-Rank Approximation for MDPs via Moment Coupling," Zhang and Gurvich introduce a novel framework for approximating Markov decision processes (MDPs) that stands on two pillars: (i) state aggregation, as the algorithmic infrastructure, and (ii) central-limit-theorem-type approximations, as the mathematical underpinning. The theoretical guarantees are grounded in an approximation of the Bellman equation by a partial differential equation (PDE) where, in the spirit of the central limit theorem, the transition matrix of the controlled chain is reduced to its local first and second moments. Instead of solving the PDE, the algorithm introduced in the paper constructs a "sister" (controlled) chain whose local moments are approximately identical to those of the focal chain. Because of this moment matching, the original and sister chains are coupled through the PDE, facilitating optimality guarantees. Embedded into standard soft aggregation, moment matching provides a disciplined mechanism to tune the aggregation and disaggregation probabilities.
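To make the moment-coupling idea concrete, below is a minimal one-dimensional sketch (not the authors' algorithm): it computes the local first and second moments of a chain with transition matrix P on numeric states x, and then builds a coarse-grid "sister" chain whose one-step mean and second moment approximately match them. The function names local_moments and sister_chain, the nearest-neighbor transfer of moments to the coarse grid, and the up/down/stay construction are illustrative assumptions.

```python
import numpy as np

def local_moments(P, x):
    """Local first and second moments of a chain with transition matrix P on numeric states x."""
    dx = x[None, :] - x[:, None]              # dx[i, j] = x_j - x_i
    return (P * dx).sum(axis=1), (P * dx ** 2).sum(axis=1)

def sister_chain(P, x, y):
    """Coarse-grid sister chain: from each grid point of y move one step up, one step
    down, or stay, with probabilities chosen to match the focal chain's local moments."""
    mu, s2 = local_moments(P, x)
    h = y[1] - y[0]
    nearest = np.abs(x[None, :] - y[:, None]).argmin(axis=1)   # borrow moments from the nearest fine state
    m, v = mu[nearest], s2[nearest]
    n = len(y)
    Q = np.zeros((n, n))
    for k in range(n):
        p_up = max((v[k] + h * m[k]) / (2 * h * h), 0.0)
        p_dn = max((v[k] - h * m[k]) / (2 * h * h), 0.0)
        scale = min(1.0, 1.0 / (p_up + p_dn + 1e-12))          # keep each row a probability vector
        p_up, p_dn = scale * p_up, scale * p_dn
        Q[k, min(k + 1, n - 1)] += p_up
        Q[k, max(k - 1, 0)] += p_dn
        Q[k, k] += 1.0 - p_up - p_dn
    return Q
```

With this construction (ignoring the clipping at the boundary), the sister chain's one-step mean is h·(p_up − p_dn) and its one-step second moment is h²·(p_up + p_dn), which reproduce the targeted local moments; this is the sense in which a coarser chain can be "moment coupled" to the focal one.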
Similar resources
Restricted Low-Rank Approximation via ADMM
The matrix low-rank approximation problem with additional convex constraints has many applications and has been studied extensively. However, the problem is nonconvex and NP-hard; most existing solutions are heuristic and application-dependent. In this paper, we show that, beyond its many applications in the current literature, this problem can be used to recover a...
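As a rough illustration of how ADMM can handle a convex restriction on a low-rank fit (a sketch under assumed choices, not the method of that paper): minimize 0.5·‖X − A‖²_F + λ‖X‖_* subject to X lying in a convex set, taken here to be the entrywise-nonnegative matrices for concreteness. The split X = Z puts the nuclear-norm proximal step (singular value thresholding) and the constraint projection in separate ADMM blocks; the constraint choice, the parameters lam and rho, and the function names are assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def constrained_lowrank_admm(A, lam=1.0, rho=1.0, n_iter=200):
    """ADMM sketch for  min_X 0.5*||X - A||_F^2 + lam*||X||_*  s.t. X >= 0,
    splitting X = Z with Z carrying the nonnegativity constraint."""
    X = np.zeros_like(A)
    Z = np.zeros_like(A)
    U = np.zeros_like(A)
    for _ in range(n_iter):
        # X-step: closed-form prox of the quadratic plus nuclear norm
        X = svt((A + rho * (Z - U)) / (1.0 + rho), lam / (1.0 + rho))
        # Z-step: projection onto the convex constraint set (here: entrywise nonnegativity)
        Z = np.maximum(X + U, 0.0)
        # dual update
        U = U + X - Z
    return Z
```

Swapping in a different convex restriction only changes the projection used in the Z-step.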
Value function approximation via low-rank models
We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Unde...
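A minimal sketch of that decomposition, assuming a generic principal-component-pursuit iteration rather than that paper's exact formulation: a state-action value matrix Q is split as Q ≈ L + S, with L low-rank and S sparse, by alternating singular value thresholding with entrywise soft-thresholding. The parameter defaults and the names rpca, svt, and shrink are illustrative.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding (prox of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Entrywise soft-thresholding (prox of the l1 norm)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca(Q, lam=None, mu=None, n_iter=300):
    """Principal component pursuit: split a (noisy) state-action value matrix Q
    into a low-rank part L and a sparse part S with Q ≈ L + S."""
    m, n = Q.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))   # common heuristic
    mu = mu if mu is not None else 0.25 * m * n / (np.abs(Q).sum() + 1e-12)
    L, S, Y = np.zeros_like(Q), np.zeros_like(Q), np.zeros_like(Q)
    for _ in range(n_iter):
        L = svt(Q - S + Y / mu, 1.0 / mu)          # low-rank update
        S = shrink(Q - L + Y / mu, lam / mu)       # sparse update
        Y = Y + mu * (Q - L - S)                   # dual update
    return L, S
```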
Approximation Algorithms for l0-Low Rank Approximation
For any column A:,i the best response vector is 1, so ‖A:,i 1ᵀ − A‖0 = 2(n − 1) = 2(1 − 1/n)·OPT_F1, where OPT_F1 = n (Boolean l0-rank-1). Theorem 3 (Sublinear). Given A ∈ {0,1}^{m×n} with column adjacency arrays and with row and column sums, we can compute w.h.p. in time O(min{‖A‖0 + m + n, ψB⁻¹(m + n) log(mn)}) vectors u, v such that ‖A − uvᵀ‖0 ≤ (1 + O(ψB))·OPT_B. Theorem 4 (Exact). Given A ∈ {0,1}^{m×n} with OPT_B/‖A‖0...
Approximation Algorithms for $\ell_0$-Low Rank Approximation
We study the l0-Low Rank Approximation Problem, where the goal is, given an m×n matrix A, to output a rank-k matrix A′ for which ‖A′ − A‖0 is minimized. Here, for a matrix B, ‖B‖0 denotes the number of its non-zero entries. This NP-hard variant of low rank approximation is natural for problems with no underlying metric, and its goal is to minimize the number of disagreeing data positions. We provid...
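A toy illustration of the objective (not that paper's algorithms): for Boolean data, ‖A − uvᵀ‖0 simply counts the entries where the rank-1 matrix uvᵀ disagrees with A, and on a very small matrix the optimum can be checked by brute force. The helper names l0_error and boolean_rank1_opt are made up for this example, and the brute force is exponential, so it is for demonstration only.

```python
import numpy as np
from itertools import product

def l0_error(A, u, v):
    """Number of entries where the rank-1 matrix u v^T disagrees with A."""
    return int(np.count_nonzero(A != np.outer(u, v)))

def boolean_rank1_opt(A):
    """Brute-force optimum for Boolean l0-rank-1 on a small matrix (exponential; demo only)."""
    m, n = A.shape
    best = (A.size, None, None)
    for u in product([0, 1], repeat=m):
        for v in product([0, 1], repeat=n):
            err = l0_error(A, np.array(u), np.array(v))
            if err < best[0]:
                best = (err, np.array(u), np.array(v))
    return best

A = np.eye(4, dtype=int)            # identity matrix: every Boolean rank-1 guess must disagree somewhere
print(boolean_rank1_opt(A)[0])      # prints the minimal number of disagreements
```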
Low-rank Tensor Approximation
Approximating a tensor by another of lower rank is in general an ill-posed problem. Yet this kind of approximation is mandatory in the presence of measurement errors or noise. We show how tools recently developed in compressed sensing can be used to solve this problem. More precisely, a minimal angle between the columns of the loading matrices allows one to restore both existence and uniqueness of the...
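As a generic illustration of what a low-rank tensor approximation computes (not the compressed-sensing approach described above), here is a minimal CP (canonical polyadic) decomposition of a 3-way tensor by alternating least squares; the names cp_als and khatri_rao, the random initialization, and the fixed iteration count are assumptions.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R), giving a (J*K) x R matrix."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def cp_als(T, rank, n_iter=100, seed=0):
    """Rank-`rank` CP approximation of a 3-way tensor T via alternating least squares."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((d, rank)) for d in (I, J, K))
    # mode unfoldings consistent with T[i, j, k] ≈ sum_r A[i,r] B[j,r] C[k,r]
    T1 = T.reshape(I, J * K)
    T2 = np.transpose(T, (1, 0, 2)).reshape(J, I * K)
    T3 = np.transpose(T, (2, 0, 1)).reshape(K, I * J)
    for _ in range(n_iter):
        A = T1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = T2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = T3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```

The ill-posedness noted in the abstract is precisely why such an iteration need not converge to a well-defined best approximation without additional conditions on the loading matrices.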
Journal
Journal title: Operations Research
Year: 2022
ISSN: 1526-5463, 0030-364X
DOI: https://doi.org/10.1287/opre.2022.2392